High-level approaches to confidence estimation in speech recognition
Abstract
We describe some high-level approaches to estimating confidence scores for the words output by a speech recognizer. By “high-level” we mean that the proposed measures do not rely on decoder-specific “side information” and so should find more general applicability than measures that have been developed for specific recognizers. Our main approach is to attempt to decouple the language modeling and acoustic modeling in the recognizer in order to generate independent information from these two sources that can then be used for estimation of confidence. We isolate these two information sources by using a phone recognizer working in parallel with the word recognizer. A set of techniques for estimating confidence measures using the phone recognizer output in conjunction with the word recognizer output is described. The most effective of these techniques is based on the construction of “metamodels,” which generate alternative word hypotheses for an utterance. An alternative approach requires no other recognizers or extra information for confidence estimation and is based on the notion that a word that is semantically “distant” from the other decoded words in the utterance is likely to be incorrect. We describe a method for constructing “semantic similarities” between words and hence estimating a confidence. Results using the U.K. version of the Wall Street Journal are given for each technique.
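The semantic-distance idea in the abstract can be illustrated with a small sketch. The abstract does not say how the “semantic similarities” are constructed, so everything below is an assumption made for illustration: similarity is taken to be the Dice coefficient of sentence-level co-occurrence in a text corpus, and a word's confidence is taken to be its mean similarity to the other decoded words. The function names (`cooccurrence_counts`, `similarity`, `semantic_confidence`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count, per word and per unordered word pair, the sentences they occur in."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = set(sentence)  # each word counted once per sentence
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_counts, pair_counts


def similarity(w1, w2, word_counts, pair_counts):
    """Dice coefficient of co-occurrence, in [0, 1] (an assumed similarity measure)."""
    if w1 == w2:
        return 1.0
    denom = word_counts[w1] + word_counts[w2]
    if denom == 0:
        return 0.0  # neither word was seen in the corpus
    return 2.0 * pair_counts[frozenset((w1, w2))] / denom


def semantic_confidence(hypothesis, word_counts, pair_counts):
    """Score each decoded word by its mean similarity to the other decoded words."""
    scores = []
    for i, word in enumerate(hypothesis):
        others = [w for j, w in enumerate(hypothesis) if j != i]
        if not others:
            scores.append(0.0)  # a one-word utterance gives no context
            continue
        sims = [similarity(word, o, word_counts, pair_counts) for o in others]
        scores.append(sum(sims) / len(sims))
    return scores


# Toy corpus and hypothesis, invented purely for illustration.
corpus = [
    ["stock", "market", "shares"],
    ["stock", "shares", "price"],
    ["market", "price", "shares"],
    ["banana", "fruit", "apple"],
]
wc, pc = cooccurrence_counts(corpus)
for word, score in zip(["stock", "market", "banana"],
                       semantic_confidence(["stock", "market", "banana"], wc, pc)):
    print(f"{word}: {score:.2f}")
```

In this toy setup, a word that never co-occurs with the rest of the utterance receives the lowest score, which mirrors the notion that a semantically “distant” word is likely to be a recognition error. A real system would need a much larger corpus and a way to map such scores to calibrated confidences.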
Related papers
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent models
This paper proposes an efficient data selection technique to identify well recognized texts in massive volumes of speech data. Conventional confidence measure techniques can be used to obtain this accurate data, but they require speech recognition results to estimate confidence. Without a significant level of confidence, considerable computer resources are wasted since inaccurate recognition re...
A confidence measure for speech recognition systems based on two maximum entropy approaches
We implement a confidence estimation stage for a speech recognition system by using two maximum entropy models. These models take information from various sources, including a set of scores which have proved to be useful in confidence estimation tasks, and efficiently combine those scores. Two different approaches of the maximum entropy model are developed. First a basic model which takes advan...
Pragmalinguistic and Sociopragmatic Recognition of High and Low Level EFL Learners
This study investigated the effects of English as a foreign language (EFL) proficiency on what the authors of this study called pragmalinguistic and sociopragmatic recognition of EFL learners. To elicit the data, the study used two types of pragmatic measures: a pragmalinguistic recognition (PLR) test and a sociopragmatic recognition (SPR) test. Both tests were developed by the researchers of thi...
Journal title:
- IEEE Trans. Speech and Audio Processing
Volume 10
Publication date: 2002